Steganalysis in high dimensions: fusing classifiers built on random subspaces

Authors

  • Jan Kodovský
  • Jessica J. Fridrich
Abstract

By working with high-dimensional representations of covers, modern steganographic methods can preserve a large number of complex dependencies among individual cover elements and thus avoid detection by the current best steganalyzers. Inevitably, steganalysis needs to start using high-dimensional feature sets as well. This brings two key problems: the construction of good high-dimensional features, and machine learning that scales well with respect to dimensionality. Depending on the classifier, high dimensionality may lead to a lack of training data, infeasibly high training complexity, degraded generalization, lack of robustness to the cover source, and saturation of performance below its potential. To address these problems, collectively known as the curse of dimensionality, we propose ensemble classifiers as an alternative to the much more complex support vector machines. Based on the character of the media being analyzed, the steganalyst first puts together a high-dimensional set of diverse “prefeatures” selected to capture dependencies among individual cover elements. Then, a family of weak classifiers is built on random subspaces of the prefeature space. The final classifier is constructed by fusing the decisions of the individual classifiers. The advantages of this approach are its universality, low complexity, simplicity, and improved performance compared to classifiers trained on the entire prefeature set. Experiments with the steganographic algorithms nsF5 and HUGO demonstrate the usefulness of this approach over the current state of the art.
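The three steps in the abstract (a high-dimensional prefeature set, weak classifiers on random subspaces, fusion of their decisions) can be illustrated with a minimal sketch. The abstract does not specify the base learner or the fusion rule; the Fisher linear discriminant, the majority vote, the toy Gaussian data, and every name and parameter value below (train_fld, train_ensemble, predict_ensemble, make_toy_data, n_learners, d_sub, ridge, shift) are assumptions chosen for illustration only.

```python
import numpy as np


def train_fld(X, y, ridge=1e-6):
    """Fit a Fisher linear discriminant (assumed base learner).

    Returns the projection vector w and a decision threshold b.
    Labels: 0 = cover, 1 = stego.
    """
    cover, stego = X[y == 0], X[y == 1]
    mu_c, mu_s = cover.mean(axis=0), stego.mean(axis=0)
    # Sum of the class covariances, lightly regularised to stay invertible.
    Sw = np.atleast_2d(np.cov(cover, rowvar=False) + np.cov(stego, rowvar=False))
    Sw = Sw + ridge * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mu_s - mu_c)
    # Threshold halfway between the projected class means.
    b = 0.5 * (mu_c @ w + mu_s @ w)
    return w, b


def train_ensemble(X, y, n_learners=51, d_sub=20, seed=0):
    """Train one weak classifier per randomly drawn feature subspace."""
    rng = np.random.default_rng(seed)
    learners = []
    for _ in range(n_learners):
        # Random subspace: d_sub distinct prefeature indices.
        idx = rng.choice(X.shape[1], size=d_sub, replace=False)
        w, b = train_fld(X[:, idx], y)
        learners.append((idx, w, b))
    return learners


def predict_ensemble(learners, X):
    """Fuse the individual decisions by majority vote (assumed fusion rule)."""
    votes = np.zeros(X.shape[0], dtype=int)
    for idx, w, b in learners:
        votes += (X[:, idx] @ w > b).astype(int)
    # An odd number of learners avoids ties.
    return (2 * votes > len(learners)).astype(int)


def make_toy_data(n, d, shift=0.3, seed=0):
    """Hypothetical stand-in for prefeatures: Gaussian noise, stego class shifted."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n // 2)
    X = rng.standard_normal((n, d)) + shift * y[:, None]
    return X, y


if __name__ == "__main__":
    X_train, y_train = make_toy_data(400, 200, seed=1)
    X_test, y_test = make_toy_data(400, 200, seed=2)
    model = train_ensemble(X_train, y_train)
    accuracy = (predict_ensemble(model, X_test) == y_test).mean()
    print(f"test accuracy on toy data: {accuracy:.3f}")
```

Each base learner sees only d_sub of the available dimensions, so the cost of training a single learner depends on the subspace size rather than on the full prefeature dimensionality. This is one way to read the low-complexity claim in the abstract.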


Similar resources

An Effective Ensemble-based Classification Algorithm for High-Dimensional Steganalysis

Recently, ensemble learning algorithms have been proposed to address the challenges that the curse of dimensionality poses for high-dimensional classification in steganalysis, and they obtain superior performance. In this paper, we extend the state-of-the-art steganalysis tool developed by Kodovsky and Fridrich, the Kodovsky ensemble classifier, and propose a novel method called CSRS for high-dimensional ...


Semi-supervised classification based on random subspace dimensionality reduction

Graph structure is vital to graph-based semi-supervised learning. However, the problem of constructing a graph that reflects the underlying data distribution has seldom been investigated in semi-supervised learning, especially for high-dimensional data. In this paper, we focus on graph construction for semi-supervised learning and propose a novel method called Semi-Supervised Classification base...


Towards dependable steganalysis

This paper considers the research goal of dependable steganalysis: where false positives occur once in a million or less, and this rate is known with high precision. Despite its importance for real-world application, there has been almost no study of steganalysis which produces very low false positives. We test existing and novel classifiers for their low false-positive performance, using milli...


Steganalysis with Classifier Combinations

Blind steganalysis depends on the choice of the feature set and of the machine learning classifiers used for classification. While the performance of individual classifiers is good, classification accuracy is seen to increase when classifiers are combined appropriately. This research implements image steganalysis by fusing classifiers under various data fusion schemes. We intend to analyse t...


Going from small to large data in steganalysis

With most image steganalysis traditionally based on supervised machine learning methods, the size of training data has remained static at up to 20000 training examples. This potentially leads to the classifier being undertrained for larger feature sets and it may be too narrowly focused on characteristics of a source of cover images, resulting in degradation in performance when the testing sour...




Publication date: 2011